New Baseline for Project Structure #1524
Closed
Summary of Changes in the "base-requirements" Branch
Added comprehensive configuration in app/core/config.py for multiple databases and services, including:
- MongoDB configuration with connection URI
- Redis settings with connection URI
- Pinecone Vector Database settings
- Enhanced Celery configuration
- Kafka settings
- NLP model settings
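
As a rough illustration of the kind of settings object described above, here is a minimal sketch that reads connection details from environment variables. The PR text does not show the contents of app/core/config.py, so every field name, environment variable name, and default value below is a hypothetical placeholder. The sketch uses only the standard library to stay self-contained; the real file may well be built on a settings library instead.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; field names are assumptions, not the PR's."""
    mongodb_uri: str
    redis_uri: str
    pinecone_api_key: str
    celery_broker_url: str
    kafka_bootstrap_servers: str
    nlp_model_name: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping of environment variables.

    Falls back to local-development defaults when a variable is missing.
    """
    env = os.environ if env is None else env
    redis_uri = env.get("REDIS_URI", "redis://localhost:6379/0")
    return Settings(
        mongodb_uri=env.get("MONGODB_URI", "mongodb://localhost:27017"),
        redis_uri=redis_uri,
        pinecone_api_key=env.get("PINECONE_API_KEY", ""),
        # Assumption: the broker defaults to the Redis URI; the branch also
        # lists RabbitMQ as a dependency, so the real default may differ.
        celery_broker_url=env.get("CELERY_BROKER_URL", redis_uri),
        kafka_bootstrap_servers=env.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092"),
        nlp_model_name=env.get("NLP_MODEL_NAME", "en_core_web_sm"),
    )


if __name__ == "__main__":
    settings = load_settings()
    print(settings.mongodb_uri)
```

Passing the environment in as a mapping keeps the loader easy to test, since a plain dict can stand in for `os.environ`.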
Updated pyproject.toml with new dependencies for:
- Database clients: MongoDB (motor, pymongo), Redis, Pinecone
- Message queue and task processing: Celery, Flower, Kafka, RabbitMQ
- ML/NLP tools: spaCy, Transformers, Sentence-Transformers, scikit-learn
- Data processing: NumPy, Pandas
Added Celery task queue integration with:
- app/tasks/celery_app.py: Celery application configuration
- app/tasks/worker.py: Task implementations for:
  - Scraping social media content
  - Analyzing social data
  - Generating reports
  - Processing data pipeline
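
The four task types listed above suggest a scrape, analyze, report flow tied together by a pipeline task. The sketch below shows that flow as plain Python functions so it runs without a broker. The function names, signatures, and canned data are all hypothetical, and the word-count "analysis" is a stand-in for whatever NLP the real tasks perform. In a Celery setup, each function would typically be registered with the Celery app's task decorator and invoked asynchronously rather than called directly.

```python
from collections import Counter
from typing import Dict, List, TypedDict


class Analysis(TypedDict):
    post_count: int
    word_counts: Dict[str, int]


def scrape_social_media(handle: str) -> List[str]:
    """Stand-in for the scraping task: returns canned posts, no network calls."""
    return [
        f"{handle} announces new policy",
        f"{handle} responds to critics on policy",
    ]


def analyze_social_data(posts: List[str]) -> Analysis:
    """Stand-in for the analysis task: counts lowercase word frequencies."""
    words = (word.lower() for post in posts for word in post.split())
    return {"post_count": len(posts), "word_counts": dict(Counter(words))}


def generate_report(handle: str, analysis: Analysis) -> str:
    """Stand-in for the reporting task: renders a one-line summary."""
    distinct = len(analysis["word_counts"])
    return f"{handle}: {analysis['post_count']} posts, {distinct} distinct words"


def process_data_pipeline(handle: str) -> str:
    """Chains the three stages in order, as a pipeline task might."""
    posts = scrape_social_media(handle)
    analysis = analyze_social_data(posts)
    return generate_report(handle, analysis)


if __name__ == "__main__":
    print(process_data_pipeline("@example"))
```

Keeping each stage a small function with plain inputs and outputs means the stages can be tested on their own before they are wired into a task queue.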
Modified Dockerfile for the main application
Added Dockerfile.celery for dedicated Celery workers
Updated Docker configuration for multiple services
Reorganized application structure
Added new directories for tasks, API endpoints, and services
Updated README with new project information
Added documentation for the hybrid database architecture
The branch implements a foundation for a Political Social Media Analysis Platform with a hybrid database approach (PostgreSQL, MongoDB, Redis, Pinecone) and data processing capabilities (Celery, Kafka, ML/NLP tools). These changes set up the infrastructure needed for collecting, processing, and analyzing social media content related to political entities.
The most significant changes appear to be in the backend architecture, enabling a more sophisticated data processing pipeline while maintaining the core FastAPI functionality.